| nettime's_roving_reporter on Wed, 10 Mar 2004 18:20:33 +0100 (CET) |
| <nettime> WiReD: 'infectious blogs' |
[ via tbyfield; the lynx output was de-doubleclickified.]
< http://wired.com/news/print/0,1294,62537,00.html >
Warning: Blogs Can Be Infectious
By [17]Amit Asaravala
Story location:
[18]http://www.wired.com/news/culture/0,1284,62537,00.html
02:00 AM Mar. 05, 2004 PT
The most-read webloggers aren't necessarily the ones with the most
original ideas, say researchers at Hewlett-Packard Labs.
Using newly developed techniques for graphing the flow of information
between blogs, the researchers have discovered that authors of popular
blog sites regularly borrow topics from lesser-known bloggers -- and
they often do so without attribution.
These findings are important to sociologists who are interested in
learning how ideas grow from isolated topics into full-blown epidemics
that "infect" large populations. Such an understanding is also
important to marketers, who hope to be able to pitch products and
ideas directly to the most influential people in a given group.
"There is a lot of speculation that really important people are highly
connected, but really, we wonder if the highly connected people just
listen to the important people," said Lada Adamic, one of the four
researchers working on the project.
To satisfy their curiosity, the researchers began analyzing data from
Intelliseek's [21]BlogPulse Web crawler, which regularly mines
thousands of blogs for references to people, places and events.
When they plotted the links and topics shared by various sites, they
discovered that topics would often appear on a few relatively unknown
blogs days before they appeared on more popular sites.
"What we're finding is that the important people on the Web are not
necessarily the people with the most explicit links (back to their
sites), but the people who cause epidemics in blog networks," said
researcher Eytan Adar.
These infectious people can be hard to find because they do not always
receive attribution for being the first to point to an interesting
idea or news item.
Indeed, the team at HP Labs found that when an idea infected at least
10 blogs, 70 percent of those blogs did not link back to any other
blog that had previously mentioned the idea.
To get past this obstacle, the researchers developed techniques to
infer where information might have come from, based on the
similarities in text, links and infection rates.
For instance, if Blog A used the words "furry germs" to link to an
infectious topic like [22]Giantmicrobes just days after Blog B in the
same social circle used the exact same words and link, that would be a
good sign that Blog A copied Blog B.
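
The "furry germs" example can be made concrete with a small sketch.
This is only an illustration of the idea as the article describes it,
not the researchers' actual method: the Mention record, the
likely_source function, the word-overlap (Jaccard) similarity, and the
7-day window and 0.5 threshold are all assumptions made here. The
"same social circle" condition is left out for brevity.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Mention:
    """One blog's link to a topic (hypothetical record layout)."""
    blog: str
    day: date
    url: str
    anchor_text: str


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two strings, in [0, 1]."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def likely_source(later: Mention,
                  earlier_mentions: List[Mention],
                  max_days: int = 7,
                  min_similarity: float = 0.5) -> Optional[Mention]:
    """Return the earlier mention that best explains `later`, or None.

    A candidate must come from a different blog, link to the same URL,
    and precede `later` by 1..max_days days. Among candidates, the one
    with the most similar link text wins (first one wins on ties).
    """
    best, best_score = None, 0.0
    for m in earlier_mentions:
        gap = (later.day - m.day).days
        if m.blog == later.blog or m.url != later.url:
            continue
        if not (0 < gap <= max_days):
            continue
        score = jaccard(later.anchor_text, m.anchor_text)
        if score >= min_similarity and score > best_score:
            best, best_score = m, score
    return best


# Usage: Blog A repeats Blog B's exact words and link two days later.
url = "http://www.giantmicrobes.com/"
blog_b = Mention("Blog B", date(2004, 3, 1), url, "furry germs")
blog_a = Mention("Blog A", date(2004, 3, 3), url, "furry germs")
source = likely_source(blog_a, [blog_b])
```

Here source is the Blog B mention, since it shares the link and the
wording and falls inside the time window. Reversing the roles yields
None, because Blog A's mention does not precede Blog B's.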
The researchers have incorporated their techniques into a search
algorithm they call iRank. Unlike Google's [23]PageRank algorithm,
which ranks websites based on overall popularity, the iRank algorithm
ranks sites based on how good they are at injecting ideas into the
mainstream.
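
The article does not say how iRank computes its scores, so the
following is a toy stand-in rather than the real algorithm. It
assumes a list of inferred "infection" edges (source blog, infected
blog) is already available, and it scores each blog by how many other
blogs its ideas eventually reach. The function name rank_by_spread
and the scoring rule are assumptions made for illustration.

```python
from collections import defaultdict


def rank_by_spread(edges):
    """Rank blogs by the number of distinct blogs they reach downstream.

    `edges` is an iterable of (source_blog, infected_blog) pairs.
    Returns (blog, score) pairs, highest score first, with ties
    broken alphabetically by blog name.
    """
    downstream = defaultdict(set)
    blogs = set()
    for src, dst in edges:
        downstream[src].add(dst)
        blogs.update((src, dst))

    def reach(start):
        # Depth-first walk over infection edges, never counting `start`.
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in downstream.get(node, ()):
                if nxt != start and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return len(seen)

    scores = {blog: reach(blog) for blog in blogs}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Usage: a little-known blog seeds a topic that a popular one picks up.
edges = [("small", "mid"), ("mid", "big"), ("small", "other")]
ranking = rank_by_spread(edges)
```

In this example "small" ranks first because its topic reaches three
other blogs, even though nothing links back to it explicitly. A
popularity-based ranking would have no reason to favor it.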
"A lot of sites that get listed by search engines as most relevant are
not always the most relevant," said Adar. "For instance, Slashdot
often gets listed at the top, but it's just an aggregator. I may want
to go to the source."
Adar and Adamic say it's too soon to tell if iRank will be
incorporated into popular search engines.
For one thing, they plan to refine the algorithm after seeing how it
works on more data. They would also like to modify the algorithm to
resist manipulation from Google-bomb-type attacks, where collaborators
link to each other's sites to boost themselves in Google's ranking
mechanism.
In the meantime, the team has made some of its research available
online in the form of the [24]Blog Epidemic Analyzer, a Java program
that reveals the implicit and inferred links between blogs in an
interactive, visual form.
"Blogs are helping us get a better understanding of how things happen
on the Internet," said Adar. "We're hopeful that in being able to do
this research, we can apply the technology to other information, like
e-mail, to improve productivity."
© Copyright 2004, Lycos, Inc. All Rights Reserved.
References
Visible links
17. http://wired.com/news/feedback/mail/1,2330,761,00.html
18. http://www.wired.com/news/culture/0,1284,62537,00.html
21. http://www.blogpulse.com/
22. http://www.giantmicrobes.com/
23. http://www.google.com/technology/
24. http://www-idl.hpl.hp.com/blogstuff/index.html
# distributed via <nettime>: no commercial use without permission
# <nettime> is a moderated mailing list for net criticism,
# collaborative text filtering and cultural politics of the nets
# more info: majordomo@bbs.thing.net and "info nettime-l" in the msg body
# archive: http://www.nettime.org contact: nettime@bbs.thing.net